{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "WfGpInivS0fG"
   },
   "source": [
    "<h2 align=\"center\">点击下列图标在线运行HanLP</h2>\n",
    "<div align=\"center\">\n",
    "\t<a href=\"https://colab.research.google.com/github/hankcs/HanLP/blob/doc-zh/plugins/hanlp_demo/hanlp_demo/zh/pos_restful.ipynb\" target=\"_blank\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n",
    "\t<a href=\"https://mybinder.org/v2/gh/hankcs/HanLP/doc-zh?filepath=plugins%2Fhanlp_demo%2Fhanlp_demo%2Fzh%2Fpos_restful.ipynb\" target=\"_blank\"><img src=\"https://mybinder.org/badge_logo.svg\" alt=\"Open In Binder\"/></a>\n",
    "</div>\n",
    "\n",
    "## 安装"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "IYwV-UkNNzFp"
   },
   "source": [
    "无论是Windows、Linux还是macOS，HanLP的安装只需一句话搞定："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "1Uf_u7ddMhUt"
   },
   "outputs": [],
   "source": [
    "pip install hanlp_restful -U"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "pp-1KqEOOJ4t"
   },
   "source": [
    "## 创建客户端"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "id": "0tmKBu7sNAXX"
   },
   "outputs": [],
   "source": [
    "from hanlp_restful import HanLPClient\n",
    "HanLP = HanLPClient('https://www.hanlp.com/api', auth=None, language='zh') # auth不填则匿名，zh中文，mul多语种"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "EmZDmLn9aGxG"
   },
   "source": [
    "#### 申请秘钥\n",
    "由于服务器算力有限，匿名用户每分钟限2次调用。如果你需要更多调用次数，[建议申请免费公益API秘钥auth](https://bbs.hanlp.com/t/hanlp2-1-restful-api/53)。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "elA_UyssOut_"
   },
   "source": [
    "## 词性标注\n",
    "任务越少，速度越快。如指定仅执行词性标注，默认CTB标准："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 70
    },
    "id": "BqEmDMGGOtk3",
    "outputId": "2a0d392f-b99a-4a18-fc7f-754e2abe2e34"
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div style=\"display: table; padding-bottom: 1rem;\"><pre style=\"display: table-cell; font-family: SFMono-Regular,Menlo,Monaco,Consolas,Liberation Mono,Courier New,monospace; white-space: nowrap; line-height: 128%; padding: 0;\">HanLP/NR&nbsp;为/P&nbsp;生产/NN&nbsp;环境/NN&nbsp;带来/VV&nbsp;次世代/NN&nbsp;最/AD&nbsp;先进/JJ&nbsp;的/DEG&nbsp;多/CD&nbsp;语种/NN&nbsp;NLP/NN&nbsp;技术/NN&nbsp;。/PU</pre></div><br><div style=\"display: table; padding-bottom: 1rem;\"><pre style=\"display: table-cell; font-family: SFMono-Regular,Menlo,Monaco,Consolas,Liberation Mono,Courier New,monospace; white-space: nowrap; line-height: 128%; padding: 0;\">我/PN&nbsp;的/DEG&nbsp;希望/NN&nbsp;是/VC&nbsp;希望/VV&nbsp;张晚霞/NR&nbsp;的/DEG&nbsp;背影/NN&nbsp;被/LB&nbsp;晚霞/NN&nbsp;映红/VV&nbsp;。/PU</pre></div>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "HanLP('HanLP为生产环境带来次世代最先进的多语种NLP技术。我的希望是希望张晚霞的背影被晚霞映红。', tasks='pos').pretty_print()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "jj1Jk-2sPHYx"
   },
   "source": [
    "注意上面两个“希望”的词性各不相同，一个是名词另一个是动词。\n",
    "\n",
    "### 执行PKU词性标注"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 70
    },
    "id": "1goEC7znPNkI",
    "outputId": "7a3fde55-7577-49eb-92c8-48146aaa89d3"
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div style=\"display: table; padding-bottom: 1rem;\"><pre style=\"display: table-cell; font-family: SFMono-Regular,Menlo,Monaco,Consolas,Liberation Mono,Courier New,monospace; white-space: nowrap; line-height: 128%; padding: 0;\">HanLP/nx&nbsp;为/p&nbsp;生产/vn&nbsp;环境/n&nbsp;带来/v&nbsp;次世代/n&nbsp;最/d&nbsp;先进/a&nbsp;的/u&nbsp;多/a&nbsp;语种/n&nbsp;NLP/nx&nbsp;技术/n&nbsp;。/w</pre></div><br><div style=\"display: table; padding-bottom: 1rem;\"><pre style=\"display: table-cell; font-family: SFMono-Regular,Menlo,Monaco,Consolas,Liberation Mono,Courier New,monospace; white-space: nowrap; line-height: 128%; padding: 0;\">我/r&nbsp;的/u&nbsp;希望/n&nbsp;是/v&nbsp;希望/v&nbsp;张晚霞/nr&nbsp;的/u&nbsp;背影/n&nbsp;被/p&nbsp;晚霞/n&nbsp;映红/v&nbsp;。/w</pre></div>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "HanLP('HanLP为生产环境带来次世代最先进的多语种NLP技术。我的希望是希望张晚霞的背影被晚霞映红。', tasks='pos/pku').pretty_print()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 执行粗颗粒度分词和PKU词性标注"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div style=\"display: table; padding-bottom: 1rem;\"><pre style=\"display: table-cell; font-family: SFMono-Regular,Menlo,Monaco,Consolas,Liberation Mono,Courier New,monospace; white-space: nowrap; line-height: 128%; padding: 0;\">阿婆主/n&nbsp;来到/v&nbsp;北京立方庭/ns&nbsp;参观/v&nbsp;自然语义科技公司/n&nbsp;。/w</pre></div>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "HanLP('阿婆主来到北京立方庭参观自然语义科技公司。', tasks=['tok/coarse', 'pos/pku'], skip_tasks='tok/fine').pretty_print()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "举一反三，你可以指定其他pos标注集（ctb、863等）。用户有多聪明，HanLP就有多强大。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "wxctCigrTKu-"
   },
   "source": [
    "### 同时执行所有标准的词性标注"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "Zo08uquCTFSk",
    "outputId": "c6077f2d-7084-4f4b-a3bc-9aa9951704ea"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{\n",
      "  \"tok/fine\": [\n",
      "    [\"HanLP\", \"为\", \"生产\", \"环境\", \"带来\", \"次世代\", \"最\", \"先进\", \"的\", \"多\", \"语种\", \"NLP\", \"技术\", \"。\"],\n",
      "    [\"我\", \"的\", \"希望\", \"是\", \"希望\", \"张晚霞\", \"的\", \"背影\", \"被\", \"晚霞\", \"映红\", \"。\"]\n",
      "  ],\n",
      "  \"pos/ctb\": [\n",
      "    [\"NR\", \"P\", \"NN\", \"NN\", \"VV\", \"NN\", \"AD\", \"JJ\", \"DEG\", \"CD\", \"NN\", \"NN\", \"NN\", \"PU\"],\n",
      "    [\"PN\", \"DEG\", \"NN\", \"VC\", \"VV\", \"NR\", \"DEG\", \"NN\", \"LB\", \"NN\", \"VV\", \"PU\"]\n",
      "  ],\n",
      "  \"pos/pku\": [\n",
      "    [\"nx\", \"p\", \"vn\", \"n\", \"v\", \"n\", \"d\", \"a\", \"u\", \"a\", \"n\", \"nx\", \"n\", \"w\"],\n",
      "    [\"r\", \"u\", \"n\", \"v\", \"v\", \"nr\", \"u\", \"n\", \"p\", \"n\", \"v\", \"w\"]\n",
      "  ],\n",
      "  \"pos/863\": [\n",
      "    [\"w\", \"p\", \"v\", \"n\", \"v\", \"n\", \"d\", \"a\", \"u\", \"a\", \"n\", \"w\", \"n\", \"w\"],\n",
      "    [\"r\", \"u\", \"v\", \"vl\", \"v\", \"nh\", \"u\", \"n\", \"p\", \"n\", \"v\", \"w\"]\n",
      "  ]\n",
      "}\n"
     ]
    }
   ],
   "source": [
    "print(HanLP('HanLP为生产环境带来次世代最先进的多语种NLP技术。我的希望是希望张晚霞的背影被晚霞映红。', tasks='pos*'))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "以`pos`开头的字段为词性，以`tok`开头的第一个数组为单词，两者按下标一一对应。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "XOsWkOqQfzlr"
   },
   "source": [
    "### 为已分词的句子执行词性标注"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 70
    },
    "id": "bLZSTbv_f3OA",
    "outputId": "111c0be9-bac6-4eee-d5bd-a972ffc34844"
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div style=\"display: table; padding-bottom: 1rem;\"><pre style=\"display: table-cell; font-family: SFMono-Regular,Menlo,Monaco,Consolas,Liberation Mono,Courier New,monospace; white-space: nowrap; line-height: 128%; padding: 0;\">HanLP/NR&nbsp;为/P&nbsp;生产环境/NN&nbsp;带来/VV&nbsp;次世代/NN&nbsp;最/AD&nbsp;先进/JJ&nbsp;的/DEG&nbsp;多语种/NN&nbsp;NLP/NN&nbsp;技术/NN&nbsp;。/PU</pre></div><br><div style=\"display: table; padding-bottom: 1rem;\"><pre style=\"display: table-cell; font-family: SFMono-Regular,Menlo,Monaco,Consolas,Liberation Mono,Courier New,monospace; white-space: nowrap; line-height: 128%; padding: 0;\">我/PN&nbsp;的/DEG&nbsp;希望/NN&nbsp;是/VC&nbsp;希望/VV&nbsp;张晚霞/NR&nbsp;的/DEG&nbsp;背影/NN&nbsp;被/LB&nbsp;晚霞/NN&nbsp;映红/VV&nbsp;。/PU</pre></div>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "HanLP(tokens=[\n",
    "    [\"HanLP\", \"为\", \"生产环境\", \"带来\", \"次世代\", \"最\", \"先进\", \"的\", \"多语种\", \"NLP\", \"技术\", \"。\"],\n",
    "    [\"我\", \"的\", \"希望\", \"是\", \"希望\", \"张晚霞\", \"的\", \"背影\", \"被\", \"晚霞\", \"映红\", \"。\"]\n",
    "  ], tasks='pos').pretty_print()"
   ]
  }
 ],
 "metadata": {
  "accelerator": "GPU",
  "colab": {
   "collapsed_sections": [],
   "name": "pos_restful.ipynb",
   "provenance": []
  },
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}